128 research outputs found
Learning to Divide and Conquer for Online Multi-Target Tracking
Online Multiple Target Tracking (MTT) is often addressed within the
tracking-by-detection paradigm. Detections are first extracted
independently in each frame, and object trajectories are then built by
maximizing specifically designed coherence functions. Nevertheless, ambiguities
arise in the presence of occlusions or detection errors. In this paper we claim
that the ambiguities in tracking could be solved by a selective use of the
features, by working with more reliable features if possible and exploiting a
deeper representation of the target only if necessary. To this end, we propose
an online divide and conquer tracker for static camera scenes, which partitions
the assignment problem into local subproblems and solves them by selectively
choosing and combining the best features. The complete framework is cast as a
structural learning task that unifies these phases and learns tracker
parameters from examples. Experiments on two different datasets highlight a
significant improvement in tracking performance (MOTA +10%) over the state of
the art.
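The idea of partitioning the assignment problem into local subproblems can be illustrated with a minimal sketch. This is not the authors' tracker: it assumes 2D point positions, uses plain Euclidean distance as the only feature, and the names `split_into_zones`, `solve_zone`, `associate`, `radius` and `miss_cost` are hypothetical. Tracks and detections that lie within `radius` of each other are grouped into zones, and each zone is solved independently by brute force.

```python
from itertools import permutations
from math import hypot


def split_into_zones(tracks, detections, radius):
    """Group tracks and detections into connected components of the
    'closer than radius' graph; each component is one local subproblem."""
    nodes = [("t", i) for i in range(len(tracks))]
    nodes += [("d", j) for j in range(len(detections))]
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, t in enumerate(tracks):
        for j, d in enumerate(detections):
            if hypot(t[0] - d[0], t[1] - d[1]) <= radius:
                parent[find(("t", i))] = find(("d", j))

    zones = {}
    for kind, idx in nodes:
        zone = zones.setdefault(find((kind, idx)), {"t": [], "d": []})
        zone[kind].append(idx)
    return list(zones.values())


def solve_zone(zone, tracks, detections, miss_cost):
    """Exhaustively find the cheapest track-to-detection assignment in one
    zone. A track may stay unmatched (None) at a fixed cost (assumption)."""
    t_ids, d_ids = zone["t"], zone["d"]
    candidates = d_ids + [None] * len(t_ids)
    best_cost, best = float("inf"), {}
    for perm in permutations(candidates, len(t_ids)):
        cost = 0.0
        for t, d in zip(t_ids, perm):
            if d is None:
                cost += miss_cost
            else:
                cost += hypot(tracks[t][0] - detections[d][0],
                              tracks[t][1] - detections[d][1])
        if cost < best_cost:
            best_cost, best = cost, dict(zip(t_ids, perm))
    return best


def associate(tracks, detections, radius=2.0, miss_cost=2.0):
    """Solve every zone separately and merge the local assignments."""
    assignment = {}
    for zone in split_into_zones(tracks, detections, radius):
        assignment.update(solve_zone(zone, tracks, detections, miss_cost))
    return assignment


tracks = [(0.0, 0.0), (10.0, 10.0), (50.0, 50.0)]
detections = [(10.5, 10.0), (0.0, 0.5)]
result = associate(tracks, detections)
```

Brute force is only practical because each zone is small. The selective choice of features described in the abstract is not modelled here; a richer cost would replace the distance term inside `solve_zone`.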
Socially Constrained Structural Learning for Groups Detection in Crowd
Modern crowd theories agree that collective behavior is the result of the
underlying interactions among small groups of individuals. In this work, we
propose a novel algorithm for detecting social groups in crowds by means of a
Correlation Clustering procedure on people trajectories. The affinity between
crowd members is learned through an online formulation of the Structural SVM
framework and a set of specifically designed features characterizing both their
physical and social identity, inspired by Proxemic theory, Granger causality,
DTW and Heat-maps. To adhere to sociological observations, we introduce a loss
function (G-MITRE) able to deal with the complexity of evaluating group
detection performance. We show that our algorithm achieves state-of-the-art results
when relying on both ground truth trajectories and tracklets previously
extracted by available detector/tracker systems.
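As an illustration of one of the trajectory features named above, the following is a minimal sketch of a dynamic time warping (DTW) distance between two 2D trajectories. It is the textbook DTW recurrence with Euclidean point cost, not necessarily the variant used in the paper; the function name `dtw_distance` and the sample trajectories are assumptions made for the example.

```python
from math import hypot, inf


def dtw_distance(a, b):
    """Classic DTW between two sequences of (x, y) points.

    cost[i][j] holds the cheapest cumulative cost of aligning the first
    i points of a with the first j points of b.
    """
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = hypot(a[i - 1][0] - b[j - 1][0], a[i - 1][1] - b[j - 1][1])
            # Extend the cheapest of the three admissible predecessors.
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


walker = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
slow_start = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
parallel = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]

same_path = dtw_distance(walker, slow_start)
offset_path = dtw_distance(walker, parallel)
```

The warping absorbs the repeated first point of `slow_start`, so two people following the same path at different paces remain close. A pairwise distance like this could feed an affinity between crowd members, but how the paper combines it with the other features is learned and is not shown here.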
Active query process for digital video surveillance forensic applications
Multimedia forensics is an emerging discipline concerned with the analysis and exploitation of digital data to extract probative elements in support of investigations. Among such data, visual information about people and their activities, extracted efficiently from videos, is becoming increasingly appealing for forensics, due to the availability of large amounts of video-surveillance footage. Thus, many research studies and prototypes investigate the analysis of soft biometrics data, such as people's appearance and trajectories. In this work, we propose new solutions for querying and retrieving visual data in an interactive and active fashion for soft biometrics in forensics. The proposal combines transductive learning for semi-supervised search by similarity with a typical multimedia methodology based on user-guided relevance feedback, allowing active interaction with visual data on people's appearance and trajectories in large surveillance areas. The proposed approaches are very general and can be exploited independently of the surveillance setting and the type of video analytics tools.
A Distributed Outdoor Video Surveillance System for Detection of Abnormal People Trajectories
Distributed surveillance systems are nowadays widely adopted to monitor large areas for security purposes. In this paper, we present a complete multicamera system designed for people tracking from multiple partially overlapped views and capable of inferring and detecting abnormal people trajectories. Detection and tracking are performed by means of background suppression and an appearance-based probabilistic approach. Object label ambiguities are resolved geometrically, and the concept of "normality" is learned from data using a robust statistical model based on Von Mises distributions. Abnormal trajectories are detected using a first-order Bayesian network and, for each abnormal event, the appearance of the subject from each view is logged. Experiments demonstrate that our system can process up to three cameras simultaneously in real time, in an unsupervised setup and under varying environmental conditions.
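To make the role of the Von Mises distribution concrete, here is a minimal sketch that fits a single Von Mises density to observed movement directions and flags a new direction as abnormal when its density is low. This is a simplification and not the paper's model: the abstract does not detail that model, and no Bayesian network is included here. The concentration estimate uses the closed-form approximation kappa = R(2 - R^2)/(1 - R^2) of Banerjee et al., and the names `fit_von_mises`, `is_abnormal` and the `threshold` value are assumptions.

```python
import math


def bessel_i0(x):
    """Modified Bessel function of the first kind, order 0 (power series)."""
    total, term, k = 0.0, 1.0, 0
    while term > 1e-12 * total:
        total += term
        k += 1
        term *= (x / 2) ** 2 / k ** 2
    return total


def fit_von_mises(angles):
    """Estimate mean direction mu and concentration kappa from angles (rad)."""
    c = sum(math.cos(a) for a in angles) / len(angles)
    s = sum(math.sin(a) for a in angles) / len(angles)
    mu = math.atan2(s, c)
    # Cap the resultant length so kappa stays finite for near-identical angles.
    r = min(math.hypot(c, s), 0.99)
    kappa = r * (2 - r * r) / (1 - r * r)
    return mu, kappa


def von_mises_pdf(theta, mu, kappa):
    """Von Mises density on the circle."""
    return math.exp(kappa * math.cos(theta - mu)) / (2 * math.pi * bessel_i0(kappa))


def is_abnormal(theta, mu, kappa, threshold=0.05):
    """Flag a direction whose density under the fitted model is below threshold."""
    return von_mises_pdf(theta, mu, kappa) < threshold


# Training directions: people mostly walk roughly along the positive x axis.
observed = [0.1, -0.1, 0.05, -0.05, 0.0]
mu, kappa = fit_von_mises(observed)
forward_flag = is_abnormal(0.0, mu, kappa)
backward_flag = is_abnormal(math.pi, mu, kappa)
```

With these samples, a person walking against the learned flow is flagged while one walking along it is not. The threshold is arbitrary and would need tuning on real data.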
From Groups to Leaders and Back. Exploring Mutual Predictability Between Social Groups and Their Leaders
Recently, social theories and empirical observations identified small groups and leaders as the basic elements which shape a crowd. This leads to an intermediate level of abstraction that is placed between the crowd as a flow of people, and the crowd as a collection of individuals. Consequently, automatic analysis of crowds in computer vision is also experiencing a shift in focus from individuals to groups and from small groups to their leaders. In this chapter, we present state-of-the-art solutions to the group and leader detection problem, which are able to account for physical factors as well as for sociological evidence observed over short time windows. The presented algorithms are framed as structured learning problems over the set of individual trajectories. However, the way trajectories are exploited to predict the structure of the crowd is not fixed but rather learned from recorded and annotated data, enabling the method to adapt these concepts to different scenarios, densities, cultures, and other unobservable complexities. Additionally, we investigate the relation between leaders and their groups and propose the first attempt to exploit leadership as prior knowledge for group detection.
Alignment-based Similarity of People Trajectories using Semi-directional Statistics
This paper presents a method for comparing people trajectories for video surveillance applications, based on semi-directional statistics. Modelling a trajectory as a sequence of angles, speeds and time lags requires a statistical tool capable of jointly considering periodic and linear variables. Our statistical method is compared with two state-of-the-art methods.
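The trajectory representation described above can be sketched as follows. The code assumes a trajectory is given as `(t, x, y)` samples with strictly increasing timestamps, and the names `to_steps` and `angular_difference` are hypothetical. It shows only the conversion into angle, speed and time-lag triples, plus why the angle needs circular handling; it is not the paper's statistical model.

```python
import math


def to_steps(points):
    """Convert (t, x, y) samples into (angle, speed, lag) triples,
    one per consecutive pair of samples."""
    steps = []
    for (t0, x0, y0), (t1, x1, y1) in zip(points, points[1:]):
        lag = t1 - t0                      # linear variable
        dx, dy = x1 - x0, y1 - y0
        angle = math.atan2(dy, dx)         # periodic variable in (-pi, pi]
        speed = math.hypot(dx, dy) / lag   # linear variable
        steps.append((angle, speed, lag))
    return steps


def angular_difference(a, b):
    """Smallest absolute difference between two angles, in [0, pi]."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


trajectory = [(0, 0.0, 0.0), (1, 1.0, 0.0), (3, 1.0, 2.0)]
steps = to_steps(trajectory)

# Two headings just either side of the +/-pi wrap-around point.
naive = abs(3.1 - (-3.1))
circular = angular_difference(3.1, -3.1)
```

The naive difference between the two headings is 6.2 rad although they are almost identical directions, which is why the angle cannot be treated like the speed and time lag. This is the motivation for a tool that handles periodic and linear variables jointly.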
Generative Adversarial Models for People Attribute Recognition in Surveillance
In this paper we propose a deep architecture for detecting people attributes (e.g. gender, race, clothing ...) in surveillance contexts. Our proposal explicitly deals with the poor resolution and occlusion issues that often occur in surveillance footage by enhancing the images by means of Deep Convolutional Generative Adversarial Networks (DCGAN). Experiments show that by combining our Generative Reconstruction and Deep Attribute Classification Network we can effectively extract attributes even when resolution is poor and in the presence of strong occlusions covering up to 80% of the whole person figure.